Synthetic Biology — Latest Matching Preprints

1

Barcoded-Plasmid DNA library construction for recording cell lineage trees enabled by a Scalable and modular Biofoundry-based Automated Robotic Pipeline

Tassinari, E.; Ives, L.; Hawkins, E.; Annese, D.; Fonseca, S.; Lan, Y.; Haerty, W.; Wojtowicz, E.; Grandellis, C.

2026-07-08 synthetic biology 10.64898/2026.07.07.736956 medRxiv

Top 0.1%

22.4%

High-quality plasmid DNA purification at high throughput remains a significant bottleneck in molecular biology and bioengineering. Current methods frequently fail to deliver sufficient yields of pure, transfection-grade DNA required for genetic engineering applications in mammalian cells. Here, we present a Biofoundry-based automated pipeline using the CyBio FeliX robotic liquid handling platform to rapidly purify plasmid DNA with minimal manual intervention. The protocol leverages Solid Phase Reversible Immobilisation (SPRI)-based magnetic bead technology to ensure consistency, scalability, and DNA purity suitable for downstream viral particle production and mammalian cell transfection. The pipeline supports flexible processing of between 8 and 96 samples per run, making it adaptable across a wide range of experimental scales. The protocol is openly available via the Earlham Institute GitHub repository, enabling broad adoption across the bioscientific community and contributing to the growing toolkit of reproducible, scalable engineering biology workflows. In this work, we employed an integrated robotic pipeline to process 528 pooled DNA plasmids and built a Lentiviral DNA plasmid library for lineage tracing, validated the library by sequencing, and demonstrated efficacy in downstream mammalian cell transfection experiments.

2

MozClo: An Expanded MoClo Toolset for Large Multigene Assembly and Plant Transformations

Straub, G.; Aldrich, D.; Tobin, C.

2026-07-10 synthetic biology 10.64898/2026.07.09.737387 medRxiv

Top 0.1%

7.5%

The Modular Cloning (MoClo) and PhytoBrick standards have revolutionized plant synthetic biology by establishing a standardized, hierarchical assembly grammar. However, as the engineering of complex metabolic pathways, multi-trait stacks, and synthetic gene circuits expands, existing toolkits hit practical boundaries in assembly capacity and fixed grammars. To overcome these bottlenecks, we present MozClo, an expansion of the MoClo/PhytoBrick architecture. MozClo expands the standard Level 1 assembly framework to 10 positions using new L1 acceptors, end-linkers and dummy parts. We also identify and resolve a critical sticky-end collision at L1 position 7 that has caused assembly failures during L2 cloning of large plasmids. To address commercial DNA synthesis length constraints and to lower cloning costs, we designed a universal 5-in-1 gene fragment multiplexing system. This architecture embeds up to five distinct parts flanked by orthogonal pairs of BpiI restriction sites into a single synthesized fragment, allowing them to sort independently into their respective L0 acceptor plasmids while maintaining complete modular flexibility of part types. Finally, we provide Level 2 cloning backbones with built-in selection genes for common soybean transformation methods to facilitate downstream plant selection. Together, these advancements reduce DNA synthesis overhead and accelerate the construction of complex multigene payloads for plant biotechnology.

3

Ontology-driven software engineering using LLMs for knowledge graphs in engineering biology

Medeni, I. T.; Ünal, M.; Galizi, R.; Bartley, B.; Beal, J.; Myers, C. J.; Vaidyanathan, P.; Mısırlı, G.

2026-05-30 synthetic biology 10.64898/2026.05.29.728869 medRxiv

Top 0.1%

7.2%

Large language models have transformed software engineering practices. However, generated artefacts are not always developer-friendly and may only partially meet complex requirements. As the need to standardise, integrate, and develop tools in engineering biology increases, novel approaches are needed to create and maintain intuitive software sustainably. Here, we present an ontology-driven approach using large language models to create user-facing software libraries for knowledge graphs. We introduce an ontology-to-language framework to systematically map domain terms and graph structures. We then demonstrate this approach by creating an ontology for the latest Synthetic Biology Open Language standard and generating the sbol-script software library, which can be used within browsers or to develop applications with native web support. This ontology-driven software engineering approach and these resources are essential for the community and for facilitating the development of sustainable software projects. The SBOL3 Ontology and the sbol-script library are available from https://github.com/SynBioDex/sbol-owl3 and https://github.com/SynBioDex/sbol-script.

4

CRISPRi-assisted E. coli strains increase success rate of burdensome construct cloning

Faulkner, I.; Kiattisewee, C.; Darst, B.; Leejareon, P.; Yoshikuni, Y.; Zalatan, J. G.; Carothers, J. M.

2026-06-03 synthetic biology 10.64898/2026.06.02.729553 medRxiv

Top 0.1%

6.0%

Genetic constructs meant for metabolic engineering in nonmodel microbes often use similar genetic parts to those familiar to E. coli work. The typical workflow is to clone these parts into plasmids in E. coli before they are transferred to the nonmodel host or its genome. In many cases, the metabolic burden of these constructs is stronger in the E. coli cloning phase of the workflow than in the eventual host, possibly resulting in mutation or other failure during cloning. Here, we apply generic knockdown of a range of popular expression systems, using CRISPR interference, by targeting guide RNAs to either promoters or RBSs that are commonly used in metabolic engineering. Generic targeting of a constitutive promoter series, combined with genome integration of CRISPR components, allows the use of only one or a few specific cloning strains to achieve strong knockdown of a wide range of constructs. Further, we present a recombinase-based workflow for easily adding guide RNAs with custom targets, so that users can knock down any desired promoter or ORF. Together, this group of strains comprises easy-to-use cloning strains meant for increasing success rates of difficult or burdensome cloning reactions, ultimately allowing more ambitious genetic constructs to reach their intended context.

5

AI-assisted improvement of Aspergillus oryzae β-galactosidase using an Ensemble of Protein Language Models

Trapote Fernandez, A.; Fernandez, A.; Mendez-Liter, J. A.; Prieto, A.; Barriuso, J.; Osorio, F. G.

2026-05-21 synthetic biology 10.64898/2026.05.20.726739 medRxiv

Top 0.1%

5.4%

β-Galactosidases (BGs) are essential enzymes widely used in the food industry, particularly in the production of lactose-free products. Among them, the BG from Aspergillus oryzae is of industrial relevance due to its activity at acidic pH and moderate thermal tolerance. However, enhancing its catalytic performance remains a key challenge. Traditional enzyme engineering methods are time-consuming and resource-intensive, limiting their scalability. Recent advances in Artificial Intelligence (AI), particularly those based on Natural Language Processing, offer a promising alternative by enabling efficient exploration of protein sequence space and prediction of beneficial mutations. In this study, we introduce an ensemble-based, zero-shot Protein Language Model pipeline that reconciles predictions from six independent models (ESM2 and the five ESM1v variants) combined with a diversity-aware candidate selection strategy. Applied to the BG from A. oryzae, this approach identified beneficial mutations leading to novel enzyme variants with up to a four-fold increase in catalytic efficiency on oNPGal, a two-fold increase on lactose, and, independently, a T338I variant with markedly enhanced thermostability (≈80% residual activity after 24 h at 60 °C), all without requiring supervised fine-tuning on experimental fitness data. Our results demonstrate that consensus across an ensemble of PLMs can efficiently enrich beneficial substitutions in industrially relevant enzymes and substantially reduce the number of wet-lab candidates that need to be screened.

6

A two-step selection method for in vitro evolution of translational proteins

Sakurai, A.; Shoji, K.; Ichihashi, N.

2026-05-10 synthetic biology 10.64898/2026.05.09.724044 medRxiv

Top 0.1%

4.0%

Improving the reconstituted translation system is a key requirement for bottom-up synthetic biology. Here, we developed a two-step in vitro evolutionary method that can be used for improving translational proteins. In this method, two distinct conditions were sequentially applied while maintaining genotype-phenotype linkage in water-in-oil droplets. Using this method, we performed in vitro evolution of four translation factors, IleRS, PheRS, EF-G, and EF-Tu, and identified mutations that modestly enhanced translation activity in in vitro expression assays. One of the EF-G mutations (P610S) increased activity per protein approximately 2-fold for the recombinant protein purified from E. coli. This selection method is useful for improving translational proteins for bottom-up synthetic biology.

7

A Low-Cost, High-Throughput Design-Build-Test Pipeline for Engineering Genetic Systems: Stress Testing with Complex Structural Proteins

Adamson, H. E.; McLellan, J. R.; Singhal, K.; Demirel, M. C.; Salis, H. M.

2026-06-09 synthetic biology 10.64898/2026.06.08.729977 medRxiv

Top 0.1%

3.5%

Genetic systems engineering is constrained by high DNA synthesis costs, assembly inefficiencies, and challenges in expressing complex proteins. To address these limitations, we developed a highly parallel, low-cost pipeline for the design, assembly, and functional screening of genetic systems, which we stress-tested on highly repetitive structural proteins, including spider silk, biocements, reflectins, and talins. The integrated pipeline combines computational genetic systems design, low-cost many-plasmid DNA assembly from oligopools, automated many-to-many mapping using nanopore sequencing data, and a label-free biosensor to measure single-cell protein expression levels. We applied this pipeline to build 240 plasmids, achieving an 88% success rate (up to 2000 bp) using standard clonal isolation and 58% assembly efficiency (up to 5600 bp) without selective DNA purification, while lowering material costs by up to 24-fold. We applied the biosensor to identify genetic factors that create distinct cellular subpopulations with varying protein expression levels. Overall, the integrated pipeline will dramatically lower the cost of high-throughput synthetic biology, while demonstrating how designing genetic systems to improve build efficiency ("design for build") and directly incorporating biosensors into genetic systems ("design for test") will greatly accelerate design-build-test workflows.

8

Storing >1 byte of information in 16S ribosomal RNA using orthogonal trans-splicing ribozymes

Dysart, M. J.; Fang, L.; Karinje, L. K.; Chappell, J.; Stadler, L. B.; Silberg, J. J.

2026-07-15 synthetic biology 10.64898/2026.07.14.738544 medRxiv

Top 0.1%

3.3%

Catalytic-RNA (cat-RNA) expressed from mobile DNA can record cellular events, such as the uptake of plasmids via horizontal gene transfer, by splicing a barcode onto 16S ribosomal RNA (rRNA) - a system termed RNA addressable modification (RAM). However, scaling RAM to record multiple simultaneous biological events requires large numbers of orthogonal cat-RNA whose signals reflect the biological features under investigation rather than variability arising from the barcode sequence. Here, we explore how to design orthogonal cat-RNA to record information about multiple plasmid-encoded traits in parallel. We show that cat-RNA having tRNA-derived barcodes with sequence variation in the anticodon stem-loop present greater signal consistency within Escherichia coli than mRNA-derived barcodes. When orthogonal cat-RNA designs harboring tRNA-derived barcodes were evaluated in Vibrio natriegens and Pseudomonas putida, increased variance was observed compared with Escherichia coli. Nevertheless, the signal consistency was sufficient to use these orthogonal cat-RNAs to report on the relative activities of four promoters and two origins of replication by sequencing barcoded-rRNA derived from the three organisms. These results show how RAM can be multiplexed to report on mobile DNA features in microbial communities and illustrate the importance of accounting for variability in RNA outputs when designing and interpreting multiplexed RNA barcoding data.

9

3' Exonuclease-mediated DNA assembly at room temperature and below

Irving, O. J.; Khan, C. J.; Albrecht, T.

2026-07-08 synthetic biology 10.64898/2026.06.17.732819 medRxiv

Top 0.1%

3.3%

DNA assembly is a cornerstone of synthetic biology, enabling the construction of bespoke genetic systems for applications ranging from metabolic engineering to DNA nanotechnology. Conventional Gibson Assembly (GA), the most widely used method, relies on 5' exonucleolytic resection and elevated temperatures (~50 °C), which together prevent the retention of 5' modifications and restrict compatibility with temperature-sensitive functionalities. Here, we report a DNA assembly strategy, 3' exonuclease-mediated low-temperature DNA assembly (3LTDA), which generates complementary 5' overhangs while preserving 5' end integrity. This approach enables the efficient assembly of blunt-ended, 5'-functionalised DNA fragments into both linear and circular constructs at ambient temperature (21 °C), with some assembly observed at temperatures as low as 4 °C. We systematically optimise reaction conditions and demonstrate that this method supports efficient plasmid re-circularisation and multi-fragment assembly, including the construction of a ~12.5 kbp plasmid from multiple DNA components. Comparative analysis across several DNA substrates shows that, under their respective optimal conditions, this approach matches or exceeds GA performance, improving assembly efficiency by up to 12.8%. Sequence analysis confirms high fidelity with no detectable base-pairing errors across assembled junctions. Crucially, this method preserves chemically functionalised 5' termini, enabling downstream conjugation and biochemical functionality. Retention of azide and biotin modifications was verified through fluorescence imaging, bead-based co-localisation, and enzymatic activity in ELISA-based assays. This is in contrast to GA-assembled controls, which showed complete loss of functionality under comparable conditions. We further assembled 5 kbp dsDNA using 3LTDA from four independent segments, three with different fluorescence reporters, and the fourth containing a biotin group for microparticle conjugation, each on the 5' end. Under fluorescence illumination, bead-bound DNA with all three fluorescence markers was detected. Conventional GA-assembled constructs, on the other hand, failed to retain the reporter groups, and the fluorescence images did not show the presence of any fluorescent markers. In addition to enhanced performance, the method could also reduce reagent cost and eliminate the need for elevated temperatures, simplifying workflows and expanding the applicability of multi-functionalised DNA constructs. Collectively, this work establishes 3LTDA as a robust, low-temperature alternative to conventional GA, with advantages for applications requiring precise chemical modification, temperature-sensitive components, or deployment outside conventional laboratory environments.

10

FASTOP - Fast editing toolkit for top expression sites in yeast

Borah, M.; Gautron, N.; Courdavault, V.; Naseri, G.

2026-05-08 synthetic biology 10.64898/2026.05.07.723299 medRxiv

Top 0.1%

3.3%

Budding yeast Saccharomyces cerevisiae is a workhorse chassis for producing added food and agricultural compounds. However, building multi-enzymatic pathways for these chemicals often requires iterative genomic integration, underscoring the need for efficient, rapid genome-editing tools that can reliably target transcriptionally active chromosomal regions. In this study, to accelerate strain construction, we established a genome-editing toolkit to rapidly engineer eight loci that are highly expressed hot-spots yet nonessential genomic sites suitable for stable pathway assembly. Our approach integrates three key design features: (i) selectable markers to enable rapid screening of edited cells, (ii) extended homology arms that leverage the yeast homology-directed repair machinery for robust genomic integration, and (iii) co-delivery of Cas9 and guide RNAs to promote efficient double-stranded DNA breaks at specific integration sites. The sequence independence of FASTOP relies on the release of integration cassettes from integrative vectors, mediated by restriction digestion at two flanking multiple-cutting sites in the integration module to minimize the risk of introducing sequence errors during PCR amplification of the integration cassettes. Following the introduction of a fluorescent reporter cassette, we observed high integration efficiencies across the target sites. We then integrated the biosynthetic pathway of plant-derived flavonoid naringenin into the hot-spots of the yeast genome using the FASTOP toolkit. Our results demonstrated that upon expressing the five essential genes in simple shake flask culture, naringenin production reached 505.7 mg/L, representing a significant (69-fold) increase over previously reported titers for comparable minimal heterologous pathways in S. cerevisiae. Together, the FASTOP toolkit provides a user-friendly platform for reliably modifying hot-spot loci to rapidly construct multi-enzymatic metabolic pathways in S. cerevisiae, while achieving high production levels for high-value food-relevant metabolites.

11

PhAGE Enables One-Step Genome Integration of Large DNA Fragments in Escherichia coli

Nozaki, S.; Miwa, Y.

2026-04-24 synthetic biology 10.64898/2026.04.23.720475 medRxiv

Top 0.1%

3.2%

Escherichia coli is a well-established model organism in molecular biology and biotechnology. Despite its long history as a laboratory workhorse, the efficient single-step chromosomal integration of large DNA fragments remains a challenge. Currently known methods are either simple but limited in insert size, or flexible but laborious, requiring plasmid construction or multi-step procedures. Here, we present PhAGE (Phage-Assisted Genome Engineering), which enables the integration of ~20 kb DNA fragments into the E. coli genome within a single day. The PhAGE method uses in vitro packaging of recombinant DNA into bacteriophage capsids, followed by general transduction to introduce pre-assembled DNA with flanking homology arms into recipient cells. This approach allows efficient and landing pad-free integration of large constructs into the target loci. We demonstrate its usefulness through rapid integration of multi-gene operons. PhAGE resolves the long-standing trade-off between simplicity and insert size in E. coli genome engineering, accelerating strain construction across a wide range of applications, from biosynthetic pathway engineering to genome-scale design.

12

Post-translational modification fidelity of recombinant human lactopontin expressed in Kluyveromyces lactis

Excell, J.; Giardina, A.; Sakamoto-Rablah, E.; Royle, K.; Nunn, D.

2026-05-12 synthetic biology 10.64898/2026.05.12.724256 medRxiv

Top 0.1%

3.1%

Recombinant human lactopontin (rhLPN), an equivalent of human milk lactopontin, is of increasing interest for human nutrition applications due to its roles in mineral binding, gastrointestinal function and immune modulation. These properties depend strongly on post-translational modifications, particularly phosphorylation and glycosylation. Here, we report the production of rhLPN in Kluyveromyces lactis at laboratory and pilot scale and present a comprehensive molecular comparison with native human lactopontin (nhLPN) isolated from human milk. Mass spectrometry-based peptide mapping confirmed the primary structure and identified extensive phosphorylation, consistent with the native protein. Middle-up analyses demonstrated closely matched phosphoform distributions between rhLPN and nhLPN, while glycosylation profiling revealed a defined population of low-complexity O-glycoforms localized to the N-terminus. Functional assessment demonstrated substantially greater iron binding by phosphorylated rhLPN compared with dephosphorylated and non-phosphorylated forms. Similar phosphorylation-dependent behaviour was observed for bovine lactopontin, supporting a conserved role for phosphorylation in mineral interaction. Across five 750 L pilot scale batches, both phosphorylation and glycoform distributions were highly consistent, indicating robust process reproducibility. Together, these results demonstrate that rhLPN produced in K. lactis recapitulates key structural and functional attributes of nhLPN, supporting its suitability as a scalable ingredient for nutrition applications.

13

Machine learning guided cell-free expression maps the biochemical landscape of carbonic anhydrase

Lazar, J. T.; Komp, E.; Martinez, I.; Zolkin, K.; Notin, P. M.; Saleh, S.; Landwehr, G.; Kim, K.; Tian, A.; Shapero, B.; Karim, A. S.; Marks, D.; Beckham, G. T.; Jewett, M. C.

2026-07-08 synthetic biology 10.64898/2026.07.07.736810 medRxiv

Top 0.1%

2.5%

Carbonic anhydrases are among the fastest known biocatalysts, reversibly facilitating the hydration of CO₂ to HCO₃⁻ at rates up to 10⁷ s⁻¹, which warrants their investigation for industrial carbon capture technologies. However, engineering carbonic anhydrases to maintain stability under harsh industrial process conditions remains a key challenge, and sequence-to-function datasets compatible with machine learning to inform forward engineering are lacking. Here, we developed a high-throughput platform that couples cell-free gene expression with a gaseous CO₂ colorimetric assay to map the fitness landscapes of carbonic anhydrases. From 96 diverse natural homologs, we identified a robust variant from the Aquificota phylum and conducted an exhaustive mutational scan and functional assessment of this enzyme at 70 °C and 90 °C, covering >99% of all single-amino acid substitutions (totaling 4,365 mutations assayed in 39,285 reactions). This biochemical landscape was used to benchmark 22 zero-shot protein fitness models and identify critical mutations that improved enzyme stability at 90 °C by more than three-fold. We then used both zero-shot protein language models and supervised learning to filter 419 model-generated variants from a ProteinMPNN library of 100,000 sequences, leading to a best-in-class enzyme that retained activity after incubation at 95 °C. This work demonstrates that integrating cell-free enzyme engineering with machine learning enables opportunities for high-throughput experimental measurements to benchmark and improve protein language models, accelerate design loops, and expand functional exploration within protein families where experimental information is limited.

14

Towards autonomous biology: Compiler-Verified Protocols as a Foundation for Real World AI Execution

Song, R.; Fu, Y.; Zhao, Z.; Yu, J.; Yuan, Q.; Chen, C.-T.

2026-05-07 synthetic biology 10.64898/2026.05.05.720956 medRxiv

Top 0.1%

2.3%

Artificial intelligence has advanced from analyzing experimental data to autonomously generating hypotheses, designing experiments, and coordinating closed-loop discovery. Yet the translation from computational reasoning to physical execution remains bottlenecked by the experimental protocol, which in biology still relies on ambiguous natural-language descriptions: a medium other engineering disciplines abandoned decades ago in favor of compiler-verified specification languages. This deficit fragments reproducibility along three axes: protocol accuracy, pre-execution verification, and cross-platform portability. Existing formalisms address only subsets of these challenges, trading expressiveness for rigor, portability for standardization, or usability for provenance. Here we introduce the Biology Protocol Language (BPL), a domain-specific language with a biology-native type system in which every quantity carries physical units, every reagent declares its physical form, and every container maintains compiler-tracked state, so that implicit assumptions must be stated explicitly and physically impossible operations are rejected at compile time. We further develop BPL-COGEN, a pipeline that couples a fine-tuned 30-billion-parameter language model with the deterministic compiler in a closed generate-validate-repair loop, iteratively correcting the translation from natural-language SOPs to BPL through compiler diagnostics until all physical, dimensional, and state constraints are satisfied. On a benchmark of 300 published Nature Protocols papers, BPL-COGEN achieved an overall fidelity score of 95.1 against the source protocols as ground truth. Wet-lab experiments and cross-platform validation in GFP expression library construction and HPLC-to-UHPLC method translation confirmed that a single BPL source yielded reproducible execution across manual and liquid-handler-assisted contexts. The results established a novel pipeline that generates compiler-verified protocols, which is an essential prerequisite for physically embodied AI in biology.

15

DropSynth-Gold: Golden Gate Assembly in Emulsions Extends Multiplexed Gene Libraries to Greater Lengths

Romanowicz, K. J.; Hinton, S. R.; Villegas, N. K.; Plesa, C.

2026-06-01 synthetic biology 10.64898/2026.05.29.728538 medRxiv

Top 0.2%

2.2%

The ability to synthesize longer genes at scale remains a central challenge in multiplexed gene synthesis. DropSynth is a pooled gene synthesis platform that enables highly multiplexed, compartmentalized assembly from microarray-derived oligonucleotides, but current implementations rely on polymerase cycling assembly (PCA), which constrains fragment number, construct length, and assembly fidelity. Here we present DropSynth-Gold, an evolution of the DropSynth platform that replaces PCA with Golden Gate assembly (GGA) performed within emulsion droplets. This modification preserves the core workflow, including bead-linked oligonucleotide capture and pooled processing, while altering only the assembly chemistry and computational oligo design strategy. Emulsion-based Golden Gate assembly enables directional, multi-fragment ligation within isolated droplets, followed by recovery and amplification of full-length constructs. As a proof of concept, we constructed six 384-member libraries spanning increasing construct lengths and fragment counts, including designs from 5x300-mer fragments to 12x350-mer architectures (~3 kb). DropSynth-Gold reliably assembled full-length constructs across all libraries. A direct comparison of a shared 5x300-mer library demonstrated comparable recovery and fidelity to PCA-based DropSynth, indicating that Golden Gate assembly can replace PCA without compromising assembly performance. These gains were achieved without increasing cost, workflow complexity, or turnaround time, expanding the accessible design space for multiplexed gene synthesis.

16

Split-Indigoidine synthetase as optical reporter for benchmarking protein-protein interactions

Gonschorek, P.; Schelhas, C.; Flakowski, M.; Schenk, L.; Podolski, A.; Bode, H. B.

2026-06-04 synthetic biology 10.64898/2026.06.03.729908 medRxiv

Top 0.2%

2.2%

Indigoidine is a blue pigment biosynthesized by a single-module Non-Ribosomal Peptide Synthetase (NRPS) using L-glutamine as substrate. Despite its potential as a colorimetric reporter, no such system has been established from it to date. We used a recently characterized interdomain fusion site located between its adenylation (A) and thiolation (T) domains to develop the Indi2GO system, which provides a naked-eye detectable and quantitative optical readout of transient and covalent protein-protein-interaction (PPI) in living cells. Indi2GO enables high-throughput benchmarking and optimization of PPI tools in a standard 96-well plate reader format, without requiring exogenous substrates, specialized equipment or complex analytical workflows. We demonstrate its broad applicability with three widely used protein-protein interaction tools: SYNZIPS, inteins, and the SpyTag:SpyCatcher system. We used Indi2GO to validate novel SYNZIP pairs, which we used in NRPS engineering, highlighting its applicability for the development of novel PPI-mediating tools in the context of NRPS engineering and synthetic biology.

17

Experimental Methods for CRISPR Enzyme Assays with Fluorescence Readout

Jiang, Q.; Avaro, A. S.; Bae, H.; Sorensen, A.; Santiago, J. G.

2026-06-03 biochemistry 10.64898/2026.06.03.729647 medRxiv

Top 0.2%

1.9%

Fluorescence-based CRISPR diagnostic assays have become a popular platform for nucleic acid detection due to their programmability, configurability, specificity, and compatibility with standard laboratory equipment. However, reported enzymatic kinetic rates and limits of detection for CRISPR trans-cleavage assays vary by several orders of magnitude across the literature. This variation in performance parameters is coupled with and exacerbated by inconsistent calibration, incomplete correction of measurement biases, and nonstandardized or incomplete data-analysis procedures. We present an experimental protocol and quantitative analysis framework for fluorescence-based enzyme assays using routine laboratory instrumentation, including thermocyclers and fluorescence microplate readers. Building on previous studies of CRISPR enzyme kinetics and fluorescence calibration, we describe procedures for flat-field and background correction; comprehensive fluorescence calibration including correction for inner-filter-effect; quantification and implications of reporter degradation; extraction of Michaelis-Menten kinetic parameters; and determination of assay limits of detection. We provide step-by-step experimental guidelines and open-source Python implementations for each stage of the workflow. Using representative Cas12 trans-cleavage datasets, we demonstrate that explicit fluorescence calibration and correction procedures substantially reduce systematic bias in measured kinetic rates and improve consistency between experiments. Our framework aims to establish standardized practices for quantitative fluorescence-based CRISPR assays and provides researchers with practical tools for reproducible kinetic characterization and rational assay design.

18

Fabrication and Use of a 32-Well LED-Embedded Microplate for Optogenetic Dynamic Control

Jaiswal, B.; Black, T.; Namboothiri, H. R.; Pochana, K.; Hu, C. Y.

2026-07-10 synthetic biology 10.64898/2026.07.08.737360 medRxiv

Top 0.2%

1.9%

Optogenetic control enables light-actuated regulation of gene expression and provides a programmable interface between living cells and electronic systems. However, routine prototyping of optogenetic constructs remains limited by infrastructure. Existing closed-loop platforms often require chemostats, microfluidics, robotic handling, or custom optical sensors, which can increase cost, reduce accessibility, or constrain measurement performance. Here, we present LEMOS 2.0, an updated LED-Embedded Microplate for Optogenetic Studies, a low-cost device for optogenetic stimulation and gene-circuit characterization inside standard off-the-shelf microplate readers. LEMOS 2.0 builds on the original LEMOS platform by increasing throughput from 16 to 32 microwells and reducing light leakage between adjacent microwells, allowing dark conditions to be used as an additional illumination state. The device consists of a 3D-printed frame, individually addressable LEDs positioned next to each microwell, a rechargeable battery, and an onboard microcontroller for Bluetooth-based wireless communication. Biocompatible polydimethylsiloxane microwells are cast directly into the device by replica molding, allowing bacterial cultures to be stimulated while optical density and fluorescence are measured by the microplate reader. This protocol describes the full LEMOS 2.0 workflow, including device fabrication, circuit assembly, Arduino programming, PDMS microwell casting, plate-reader setup, strain and culture preparation, automated experiment execution, device cleanup, and fluorescence/OD600 data analysis. As a demonstration, the protocol uses the CcaSR optogenetic system, in which sfGFP expression is activated by green light and repressed by red light. LEMOS 2.0 is intended to make optogenetic perturbation and gene-expression characterization more accessible to wet-lab users, enabling faster design-build-test-learn cycles without requiring specialized bioreactor or microfluidic infrastructure.

19

A Spectrum of Possibilities: A Systematic Evaluation of Fluorescent Proteins in Cyanobacteria

Hasenklever, D.; Boecker, J.; Grankin, A.; Sener, F.; Axmann, I. M.; Behle, A.

2026-05-19 synthetic biology 10.64898/2026.05.18.725961 medRxiv

Top 0.2%

1.8%

Fluorescent reporters cover a wide range of applications in both basic and applied research. Whether a study involves microscopic imaging of protein (co-)localization, FRET, biosensing, or quantification of gene expression, fluorophores are attractive reporter candidates due to their relatively straightforward in vivo readout. For microbiological applications, a wide variety of fluorescent proteins with varying excitation and emission wavelengths, brightness levels, and maturation times are available. Careful consideration is required when selecting from this large suite of proteins, especially when choosing multiple fluorophores. Selection is further complicated in phototrophic organisms, which exhibit strong autofluorescence, especially towards the red part of the spectrum, effectively eliminating common candidates such as mCherry. In this study, the specific properties and performance of a selection of fluorescent proteins are systematically evaluated against the background of photosynthetic pigment-derived autofluorescence in the cyanobacterium Synechocystis sp. PCC 6803. Specific readouts of different combinations of fluorescent proteins are also analyzed using high-throughput methods, namely plate-reader fluorescence scans and single-cell flow cytometry. The ultimate goal is to assess each fluorescent protein with regard to (1) its ability to be discerned from cyanobacterial autofluorescence, (2) its compatibility with other fluorophores in this context, and (3) its overall suitability in cyanobacterial research. Several highly suitable fluorescent proteins for use in cyanobacteria are identified, including mTagBFP2, mNeonGreen, and mScarlet-I, along with suitable combinations that cover nearly the whole spectrum of visible light. This study expands the knowledge and toolset for current and future researchers and uncovers a whole spectrum of possibilities for fluorescent protein selection in cyanobacterial cell biology.

20

The widely used eGFP sequence produces an unintended protein product

Wang, Z.; Ma, H.; Mao, Y.; Ma, K.

2026-06-06 molecular biology 10.64898/2026.06.02.729456 medRxiv

Top 0.2%

1.8%

Plasmids are widely used for gene expression, yet their coding potential beyond the intended coding sequence (CDS) is often poorly characterized. Here, we explored putative "hidden open reading frames" (hidden ORFs) embedded within non-canonical reading frames of plasmid sequences, using a computational workflow for their identification. Using enhanced green fluorescent protein (eGFP) as a target gene, we observed unexpectedly uninterrupted ORFs in both the +2 coding frame and the reverse frame. Immunoblotting detected stable expression of the +2 frame-derived protein, but not of the reverse-frame ORF product. Motivated by these observations, we developed a computational pipeline and analyzed 6,308 eGFP-containing plasmids, identifying putative hidden ORFs in approximately 21% of constructs. Approximately 25% of hidden ORFs occurred in the +2 frame, with the remainder occurring in the reverse frame. Applied to plasmids beyond those containing eGFP, the same analytical pipeline can help avoid unintended outcomes in applications such as gene replacement therapy.
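The kind of scan the abstract describes can be sketched as a six-frame ORF search. This is a minimal illustration, not the authors' pipeline: it assumes an ORF runs from an ATG to the next in-frame stop codon, uses a hypothetical minimum-length cutoff, and treats the sequence as linear (a real plasmid is circular). Frames are reported as 0-based offsets, so the abstract's "+2 frame" corresponds to strand "+" with offset 1 relative to the intended CDS start.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset, min_codons):
    """Yield (start, end) for ATG..stop ORFs in one reading frame.

    `end` is exclusive and includes the stop codon; length is counted in
    codons before the stop. The ATG-to-stop definition is an assumption.
    """
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOPS:
            if (i - start) // 3 >= min_codons:
                yield (start, i + 3)
            start = None


def find_orfs(seq, min_codons=50):
    """Scan all six frames; return (strand, offset, start, end) tuples.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            for start, end in orfs_in_frame(s, offset, min_codons):
                hits.append((strand, offset, start, end))
    return hits
```

Any hit whose strand or offset differs from that of the annotated CDS would be a candidate hidden ORF under these assumptions; whether it is actually expressed still requires experimental confirmation, as the immunoblotting result in the abstract shows.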